A new variable selection method for classification

Author

  • Silvia Casado
Abstract

This work proposes a new "ad hoc" method for variable selection in classification, specifically in Discriminant Analysis. The method is based on the metaheuristic strategy Tabu Search. From a computational point of view, variable selection is an NP-hard problem (NP = nondeterministic polynomial time), so no known algorithm is guaranteed to find the optimal solution in reasonable time. When the problem is large, finding an optimal solution is therefore infeasible in practice. As in other optimization problems, metaheuristic techniques have proved to be good at solving problems of this type. Although the literature contains many references on selecting variables for classification, there are very few key references on selecting variables for Discriminant Analysis. In fact, the best-known statistical packages continue to use classic selection methods such as Stepwise, Backward or Forward. In the tests performed, Tabu Search obtains significantly better results than the Stepwise, Backward or Forward methods used by classic statistical packages.
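As a rough illustration of the idea in the abstract, the following is a minimal sketch of a generic Tabu Search over variable subsets. It is not the authors' algorithm: the abstract does not give the neighborhood, tabu tenure, or objective used in the paper. The sketch assumes a one-variable "flip" neighborhood (add or drop a single variable), a fixed tabu tenure, and, as a stand-in for Discriminant Analysis, the training accuracy of a nearest-centroid classifier. The names `centroid_accuracy`, `tabu_search`, `n_iter` and `tenure` are hypothetical, and the small data set is synthetic.

```python
def centroid_accuracy(X, y, subset):
    """Training accuracy of a nearest-centroid classifier on the chosen columns.

    Assumption: this is only a stand-in objective; the paper evaluates
    subsets with Discriminant Analysis instead.
    """
    if not subset:
        return 0.0
    cols = sorted(subset)
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for k, j in enumerate(cols):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    hits = 0
    for row, label in zip(X, y):
        # Predict the class whose centroid is closest (squared Euclidean distance).
        pred = min(
            centroids,
            key=lambda c: sum((row[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        hits += pred == label
    return hits / len(y)


def tabu_search(X, y, n_iter=50, tenure=3, score=centroid_accuracy):
    """Generic tabu search over variable subsets (illustrative sketch)."""
    p = len(X[0])
    current = frozenset()
    best, best_score = current, score(X, y, current)
    tabu_until = {}  # variable index -> first iteration at which it is free again
    for it in range(n_iter):
        candidates = []
        for j in range(p):
            neighbor = current ^ {j}  # flip variable j in or out of the subset
            s = score(X, y, neighbor)
            is_tabu = tabu_until.get(j, 0) > it
            # Aspiration criterion: a tabu move is allowed if it beats the best so far.
            if not is_tabu or s > best_score:
                candidates.append((s, -len(neighbor), j, neighbor))
        if not candidates:
            continue
        # Take the best admissible move even if it worsens the current solution;
        # ties are broken in favor of smaller subsets.
        s, _, j, current = max(candidates, key=lambda c: c[:3])
        tabu_until[j] = it + 1 + tenure
        if s > best_score:
            best, best_score = current, s
    return sorted(best), best_score


# Synthetic data: column 0 separates the two classes, columns 1 and 2 carry no signal.
y = [i % 2 for i in range(20)]
X = [[label * 5 + (i % 3) * 0.1, (i // 2) % 5, (i // 2) % 3]
     for i, label in enumerate(y)]

best, best_score = tabu_search(X, y)
print(best, best_score)
```

Unlike Forward, Backward or Stepwise selection, which stop at the first subset that no single step improves, this kind of search keeps moving to the best non-tabu neighbor and only remembers the best subset seen, which is what lets it leave local optima. With p candidate variables there are 2^p subsets, so exhaustive enumeration quickly becomes impractical as p grows.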

Similar resources

A New Framework for Distributed Multivariate Feature Selection

Feature selection is an important issue in classification. Selecting good features by maximizing relevance to the class label and minimizing redundancy among features improves classification accuracy. However, most current feature selection algorithms work only in a centralized manner. In this paper, we suggest a distributed version of the mRMR featu...

A New Knowledge-Based System for Diagnosis of Breast Cancer by a combination of the Affinity Propagation and Firefly Algorithms

Breast cancer has become a widespread disease among young women around the world. Expert systems developed with data mining techniques are valuable tools in the diagnosis of breast cancer and can help physicians in the decision-making process. This paper presents a new hybrid data mining approach to classify two groups of breast cancer patients (malignant and benign). The proposed approach, AP-AMBFA, con...

A Novel Scheme for Improving Accuracy of KNN Classification Algorithm Based on the New Weighting Technique and Stepwise Feature Selection

The K nearest neighbor (KNN) algorithm is one of the most frequently used techniques in data mining for its integrity and performance. Though the KNN algorithm is highly effective in many cases, it has some essential deficiencies, which affect its classification accuracy. First, the effectiveness of the algorithm is affected by redundant and irrelevant features. Furthermore, this algori...

SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy

In this paper, we propose a new gene selection algorithm, called SFLA-FS, based on the Shuffled Frog Leaping Algorithm. The proposed algorithm is used to improve cancer classification accuracy. Most biological datasets, such as cancer datasets, have a large number of genes and few samples. However, most of these genes are not usable in some tasks for example in cancer classification....

A New Hybrid Method for Improving the Performance of Myocardial Infarction Prediction

Introduction: Myocardial Infarction, also known as heart attack, normally occurs due to causes such as smoking, family history, diabetes, and so on. It is recognized as one of the leading causes of death in the world. Therefore, the present study aimed to evaluate the performance of classification models in order to predict Myocardial Infarction, using a feature selection method tha...

A Comparison between New Estimation and Variable Selection Methods in Regression Models by Using Simulation

In this paper, some new methods that have very recently been introduced for parameter estimation and variable selection in regression models are reviewed. Furthermore, we simulate several models in order to evaluate the performance of these methods under different situations. Finally, we compare the performance of these methods with that of the regular traditional variable selection methods such ...

Publication date: 2007